Fix problems when altering a column from binary to varbinary #1628
meiji163 merged 4 commits into github:master
MySQL's binlog strips trailing 0x00 bytes from binary(N) columns. PR github#915 fixed this for unique key columns only, but the same issue affects all binary columns in INSERT/UPDATE operations. Remove the isUniqueKeyColumn condition so all binary(N) columns are padded to their declared length. Fixes a variation of github#909 where the affected column is not a primary key.
Pull request overview
This PR addresses incorrect handling of BINARY(N) values read from MySQL binlog events during migrations that alter a column to VARBINARY(M), where trailing 0x00 bytes can be stripped and thus written incorrectly to the ghost table. The fix generalizes the existing padding logic so it applies to all binary columns, not just unique-key columns.
Changes:
- Apply trailing-zero padding for all `BinaryColumnType` columns during argument conversion (not only unique key columns).
- Add unit tests covering binary padding behavior in `convertArg`.
- Add a new local integration test (`localtests/binary-to-varbinary`) to reproduce the `binary -> varbinary` binlog truncation scenario.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `go/sql/types.go` | Expands binary padding logic to apply to all binary columns. |
| `go/sql/types_test.go` | Adds unit tests verifying padding behavior for truncated/full binary values. |
| `localtests/binary-to-varbinary/create.sql` | Creates a repro scenario with binlog-driven inserts/updates involving trailing-zero binary values. |
| `localtests/binary-to-varbinary/extra_args` | Supplies an `--alter` that changes data from `binary(20)` to `varbinary(32)`. |
Comments suppressed due to low confidence (1)
go/sql/types.go:87
- The padding logic builds a `bytes.Buffer` directly from `arg2Bytes`. Since `bytes.NewBuffer` can reuse the provided slice as its backing store, `Write` may mutate the input byte slice in-place (depending on capacity). Now that this code runs for all binary columns, it would be safer to avoid mutating the input by copying into a new buffer/slice before padding (and ideally keep the return type as `[]byte` to avoid unnecessary string conversions).

```go
buf := bytes.NewBuffer(arg2Bytes)
for i := uint(0); i < (this.BinaryOctetLength - uint(size)); i++ {
	buf.Write([]byte{0})
}
arg = buf.String()
```
In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text.
Thanks for the fix. I do wonder why the author of #915 chose to apply the padding only to the unique key 🤔

I think this comment has a clue:

Guess that wasn't quite true...
* Fix binary column trailing zero stripping for non-key columns

  MySQL's binlog strips trailing 0x00 bytes from binary(N) columns. PR #915 fixed this for unique key columns only, but the same issue affects all binary columns in INSERT/UPDATE operations. Remove the isUniqueKeyColumn condition so all binary(N) columns are padded to their declared length. Fixes a variation of #909 where the affected column is not a primary key.
* Simplify by removing isUniqueKeyColumn now that it's no longer used.
* In convertArg, don't convert binary data to strings.

  In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text.
* fix a lint
* Execute hook on every batch insert retry

  Co-authored-by: Bastian Bartmann <bastian.bartmann@shopify.com>
* Expose the last error message to the onBatchCopyRetry hook

  Co-authored-by: Bastian Bartmann <bastian.bartmann@shopify.com>
* Remove double retries

  CalculateNextIterationRangeEndValues needs to be recomputed on every retry in case of configuration (e.g. chunk-size) changes were made by onBatchCopyRetry hooks.
* include dev.yml (temp for Shopify)
* Update doc/hooks.md
* Remove dev.yml
* Fix retry issue where MigrationIterationRangeMinValues advances before insert completes
  - extract MigrationContext.SetNextIterationRangeValues outside of applyCopyRowsFunc, so that it doesn't run on retries
  - add an integration test for Migrator with retry hooks

  Co-authored-by: Bastian Bartmann <bastian.bartmann@shopify.com>
* Add localtest that expects gh-ost to fail on exhausted retries
* Rename method
* fmt and lint
* gofmt
* Fix problems when altering a column from `binary` to `varbinary` (#1628)
  * Fix binary column trailing zero stripping for non-key columns

    MySQL's binlog strips trailing 0x00 bytes from binary(N) columns. PR #915 fixed this for unique key columns only, but the same issue affects all binary columns in INSERT/UPDATE operations. Remove the isUniqueKeyColumn condition so all binary(N) columns are padded to their declared length. Fixes a variation of #909 where the affected column is not a primary key.
  * Simplify by removing isUniqueKeyColumn now that it's no longer used.
  * In convertArg, don't convert binary data to strings.

    In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text.
  * fix a lint
* Fix 4 trigger handling bugs (#1626)
  * fix: remove double-transformation in trigger length validation

    ValidateGhostTriggerLengthBelowMaxLength was calling GetGhostTriggerName on an already-transformed name, adding the suffix twice. This caused valid trigger names (ghost name <= 64 chars) to be falsely rejected. The caller in inspect.go:627 already transforms the name via GetGhostTriggerName before passing it, so the validation function should check the length as-is. Unit tests updated to reflect the correct call pattern: transform first with GetGhostTriggerName, then validate the result. Added boundary tests for exactly 64 and 65 char names.
  * fix: return error from trigger creation during atomic cut-over

    During atomic cut-over, if CreateTriggersOnGhost failed, the error was logged but not returned. The migration continued and completed without triggers, silently losing them. The two-step cut-over (line 793) already correctly returns the error. This aligns the atomic cut-over to do the same.
  * fix: check trigger name uniqueness per schema, not per table

    validateGhostTriggersDontExist was filtering by event_object_table, only checking if the ghost trigger name existed on the original table. MySQL trigger names are unique per schema, so a trigger with the same name on any other table would block CREATE TRIGGER but pass validation. Remove the event_object_table filter to check trigger_name + trigger_schema only, matching MySQL's uniqueness constraint.
  * fix: use parameterized query in GetTriggers to prevent SQL injection

    GetTriggers used fmt.Sprintf with string interpolation for database and table names, causing SQL syntax errors with special characters and potential SQL injection. Switched to parameterized query with ? placeholders, matching the safe pattern already used in inspect.go:553-559.
  * test: add regression tests for trigger handling bugs

    Add two integration tests:
    - trigger-long-name-validation: verifies 60-char trigger names (64-char ghost name) are not falsely rejected by double-transform
    - trigger-ghost-name-conflict: verifies validation detects ghost trigger name conflicts on other tables in the same schema
  * style: gofmt context_test.go

  Co-authored-by: Yakir Gibraltar <yakir.g@taboola.com>
  Co-authored-by: meiji163 <meiji163@github.com>
* fix update of LastIterationRange values

Co-authored-by: Jan Grodowski <jan.grodowski@shopify.com>
Description
Fixes #1627.
The issue is that binary values are incorrectly read from the binlog. This PR fixes it by changing a previously-implemented fix to apply to all binary columns, not only primary keys.
Two new tests are added: a unit test for the function being changed, and a localtest. (Oddly, the existing test in `localtests/varbinary` doesn't seem to actually involve any `varbinary` columns.)

`script/cibuild` returns with no formatting errors, build errors or unit test errors.

~~I wasn't able to figure out how to run `script/build` or `script/cibuild`. They both try to download and use a vendored Go installation, but this doesn't work for me.~~ Figured it out, it was trivial - my version of Go was too new.